Wuquan Wang, Tianjin University, China, wwq0806@tju.edu.cn PRIMARY
Jie Xu, Tianjin University, China, xujie_nm@tju.edu.cn
Dongni Hu, Tianjin University, China, dongnihu@tju.edu.cn
Panjiao Yan, Tianjin University, China, yanpanjiao@tju.edu.cn
Zhenbao Fan, Tianjin University, China, fanzhenbao@tju.edu.cn
Jie Li, Tianjin University, China, vassilee@tju.edu.cn SUPERVISOR
Kang Zhang, Tianjin University, China, kzhang@tju.edu.cn SUPERVISOR
Student Team: YES
Did you use data from both mini-challenges?
NO
D3
MYSQL
C++
Excel
SPSS
Approximately how many hours were spent working on this submission in total?
About 60 hours (60 days at 1 hour per day)
May we post your submission in the Visual Analytics
Benchmark Repository after VAST Challenge 2015 is complete?
YES
Video:
index.files\TJU-Wang-MC1-Demo.wmv
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Questions
MC1.1 – Characterize the attendance at
DinoFun World on this weekend. Describe up to twelve different types of groups
at the park on this weekend.
Limit your response to no more than 12 images and 1000
words.
Before characterizing the groups, we first identify groups of individuals using statistical clustering methods. We put any two individuals into one group if they were together for more than 7th of the entire period in the park. We then assign all the groups to 12 different types based on the average number of check-ins at each facility, using the K-means clustering algorithm. Finally, we visualize the clustering results to check whether this strategy is sound.
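The two steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used for the submission: "together" is taken to mean the same (x, y) position at the same timestamp, the threshold is left as a parameter because its value is not legible in the text, and the names `find_groups`, `kmeans`, and `min_shared_fraction` are hypothetical. The toy data uses two facility features instead of the real ones.

```python
from collections import defaultdict
from itertools import combinations
import random


def find_groups(positions, min_shared_fraction):
    """Merge people who share enough identical (timestamp, position) records.

    positions: {person_id: {timestamp: (x, y)}}
    Returns a list of sets of person ids (only groups with 2+ members).
    The pairwise loop is quadratic, so this is only a sketch for small inputs.
    """
    parent = {pid: pid for pid in positions}

    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]  # path compression
            p = parent[p]
        return p

    for a, b in combinations(positions, 2):
        track_a, track_b = positions[a], positions[b]
        shared = sum(1 for t, xy in track_a.items() if track_b.get(t) == xy)
        longest = max(len(track_a), len(track_b))
        if longest and shared / longest > min_shared_fraction:
            parent[find(a)] = find(b)  # union the two groups

    groups = defaultdict(set)
    for pid in positions:
        groups[find(pid)].add(pid)
    return [members for members in groups.values() if len(members) >= 2]


def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd's algorithm on tuples of floats; returns one label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [
            min(range(k),
                key=lambda c: sum((p - q) ** 2 for p, q in zip(pt, centers[c])))
            for pt in points
        ]
        for c in range(k):
            members = [pt for pt, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


# Toy data: A and B walk together, C walks alone.
toy_positions = {
    "A": {t: (t, 0) for t in range(10)},
    "B": {t: (t, 0) for t in range(10)},
    "C": {t: (0, t + 50) for t in range(10)},
}
toy_groups = find_groups(toy_positions, min_shared_fraction=0.8)  # 0.8 is a placeholder

# One feature vector per group: average check-ins at two hypothetical facility types.
toy_features = [(5.0, 0.0), (6.0, 0.0), (0.0, 5.0), (0.0, 6.0)]
toy_labels = kmeans(toy_features, k=2)
```

In the submission itself, `k` would be 12 and each feature vector would hold one average per facility.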
a. The size of each type of group.
Instead of merging the three days’ data, we process each day’s data separately. This is because many people borrowed devices in the park rather than installing the app on their own phones. Consequently, when an ID appears on Friday and again on Saturday, we cannot be sure whether it is the same person or whether IDs are unique only within a day.
As shown in Figure 1-1, the 12 types of groups are drawn in 12 different colors. Each circle represents a group, and its size represents the number of individuals in the group. Pointing the black arrow at a circle opens a popup showing “typeID”; “typeSize”, the total number of people within this type; “number of groups”, the total number of groups of this type; “Min”, the smallest group size; and “Max”, the largest group size within this type. In all 12 types, the smallest group contains 2 people. The radar chart shows the check-in information of the different groups.
Fig. 1-1 12 different types of groups
b. Places where different types of groups like to go.
Fig. 1-2-1 shows the places where different types of groups like to go in the park. The X-axis represents the 12 classified types of groups, the Y-axis the 6 places in the park, and the Z-axis the average number of visits per person to each place.
First, we find that most types of groups like to visit Thrill Rides. The average number of visits to Thrill Rides is significantly higher than that to other places.
Fig. 1-2-1 Distribution of types of groups in all facilities
Second, we look into individual types. Figure 1-2-2 shows the information for the 3rd type on Friday, the 8th type on Saturday, and the 7th type on Sunday. In contrast, the 8th type on Sunday is more interested in the shows.
Fig. 1-2-2 The 3rd type on Friday, 8th type on Saturday, 7th type on Sunday
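The quantity plotted on the Z-axis of Fig. 1-2-1 (average visits per person, per type and place) could be computed along these lines. This is a sketch only: the input layout, the function name `average_visits`, and the toy IDs are assumptions made for illustration.

```python
from collections import defaultdict


def average_visits(checkins, type_of_person):
    """Average number of check-ins per person, for each (type, place) pair.

    checkins: iterable of (person_id, place_category), one entry per check-in
    type_of_person: {person_id: type_id}
    Returns {type_id: {place_category: average visits per person of that type}}.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for person, place in checkins:
        counts[type_of_person[person]][place] += 1

    # Number of people in each type, including people with no check-ins.
    members = defaultdict(int)
    for type_id in type_of_person.values():
        members[type_id] += 1

    return {
        type_id: {place: n / members[type_id] for place, n in places.items()}
        for type_id, places in counts.items()
    }


# Toy data with made-up person ids and type ids.
toy_checkins = [
    (1, "Thrill Rides"), (1, "Thrill Rides"), (2, "Thrill Rides"),
    (3, "Kiddie Rides"),
]
toy_types = {1: 1, 2: 1, 3: 2}
toy_averages = average_visits(toy_checkins, toy_types)
```

Dividing by the full type size, rather than by the number of people who visited a place, keeps the averages comparable across places.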
c. The commonality of each type of group.
It is necessary to discuss what each type has in common and how the types differ.
Figure 1-3-1(a) shows how many people belong to each type of group over the three days.
Figure 1-3-1(b) shows how many groups belong to each type.
Figure 1-3-1(c) shows how many individuals are in the largest group of each type.
Fig. 1-3-1 (a) GroupSize; (b) GroupNum; (c) Max
Fig. 1-3-2 shows an example: the commonalities and differences between the 8th type (purple) and the 12th type (pink) in their Friday activities.
Fig. 1-3-2 The 8th and 12th types on Friday
d. Other observations on different types of groups.
Using a force-directed layout (Figure 1-4), we observe that most types of groups like to visit all the facilities, while a few types like to visit specific facilities.
Fig. 1-4 (a) Fri Sat force mapping; (b) Sat Sun force mapping; (c) Sun force mapping
e. Inference on the types of groups.
1. Based on the above information, we infer that the types that prefer Thrill Rides but are not interested in other facilities are probably young people.
2. The types of groups interested in Kiddie Rides may be parents and children.
3. The type that pays attention only to the daily shows may be fans of Scott Jones.
f. Suggestions to the park to better meet this type’s needs
In DinoFun World, the most popular facility type is Thrill Rides, while the least popular is Kiddie Rides. The number of people visiting Thrill Rides increased from Friday to Sunday, while more people visited Kiddie Rides on Friday than on Saturday or Sunday.
Given these findings, we suggest increasing the number of Thrill Rides facilities to reduce queuing time and attract more visitors. Large gatherings also pose safety risks. We therefore suggest adding more security staff in the popular areas to ensure the safety of the visitors. During the weekend and at show times, more security staff should be on duty.
MC1.2 – Are there notable differences in the patterns of activity in the park across the three days?
Please describe the notable difference you see.
Limit your response to no more than 3 images
and 300 words.
1. A significantly larger number of people visited the show on the weekend than on Friday. We infer that, since the show was held once in the morning and once in the afternoon, the difference across the 3 days in the number of people visiting the show is less obvious than the corresponding difference for other types of facilities.
Fig. 2-1 Left view during three days
2. The average number of people visiting Kiddie Rides on Saturday and Sunday is lower than on Friday.
3. We wrote a program that draws a heat map on top of the park map to show the population density in different parts of the park over the three days. The time unit is 1 minute, and colors encode density from green (sparse) to red (most dense). Figure 2-2 is a snapshot at Sunday 9:30 a.m., showing a few large points, which correspond to the facility entrances. The points are color-coded by their population density, ranging from gray to red. The heat map is animated over time, showing what is happening at each facility.
Fig. 2-2 Heat map at Sunday 9:30 a.m.
We find that Saturday is the only day on which a person showed movement inside the park but no check-in at any of the three gates. Fig. 2-3 displays the differences among the three mornings.
Fig. 2-3 Differences among the three mornings
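The per-minute density behind the heat map in item 3 can be sketched as follows. This is an illustration only, not the program used for the submission. The record layout and timestamp format are assumptions. The rule of counting each person once per minute at their last known position is also an assumption, as are the helper names `density_by_minute` and `density_color`.

```python
from collections import defaultdict
from datetime import datetime


def density_by_minute(records):
    """Count distinct people per grid cell for every 1-minute time unit.

    records: iterable of (timestamp, person_id, x, y), with timestamps
    formatted as 'YYYY-MM-DD HH:MM:SS' (assumed format).
    Returns {minute (datetime): {(x, y): number of people}}.
    """
    last_seen = defaultdict(dict)  # minute -> person -> (time, cell)
    for stamp, person, x, y in records:
        moment = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        minute = moment.replace(second=0)
        previous = last_seen[minute].get(person)
        if previous is None or moment >= previous[0]:
            last_seen[minute][person] = (moment, (x, y))

    density = {}
    for minute, people in last_seen.items():
        cells = defaultdict(int)
        for _, cell in people.values():
            cells[cell] += 1
        density[minute] = dict(cells)
    return density


def density_color(count, max_count):
    """Linear green (sparse) to red (dense) ramp, returned as an (r, g, b) tuple."""
    ratio = 0.0 if max_count <= 0 else min(count / max_count, 1.0)
    return (round(255 * ratio), round(255 * (1 - ratio)), 0)


# Toy records with made-up ids and coordinates.
toy_records = [
    ("2014-06-08 09:30:05", 1, 10, 20),
    ("2014-06-08 09:30:40", 1, 11, 20),
    ("2014-06-08 09:30:10", 2, 11, 20),
    ("2014-06-08 09:31:00", 2, 11, 21),
]
toy_density = density_by_minute(toy_records)
```

Animating the heat map then amounts to drawing one frame per minute key, in chronological order.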
MC1.3 – What anomalies or unusual patterns do you see? Describe no more
than 10 anomalies, and prioritize those unusual patterns that you think are
most likely to be relevant to the crime.
Limit your response to no more than 10 images
and 500 words.
1. Some people did not use any facilities all day, or used only one or two. This makes us suspicious of their purpose in entering the park.
Fig. 3-1 Fri force mapping
2. Person 657863, who left on Saturday morning, stayed in the park throughout Friday night. We find that he was part of a group that visited Alvarez Beer Garden on Friday. The other group members, however, left the Garden, leaving him behind. We propose two hypotheses: (1) he and his friends drank beer in the park on Friday evening, and he got drunk and was left behind in the park; (2) this group is related to the crime, and this person stayed to prepare for it.
3. Having observed facility No. 63 and performed a linear regression on the data sets of the three days, we find that the similarity between any two of them is larger than 0.8. The exception is Saturday afternoon, when the data differ from the others, with a similarity of less than 0.5.
Fig. 3-2 (a) Fri 63 Check-in; (b) Sat 63 Check-in; (c) Sun 63 Check-in
4. As mentioned before, one person, 657863, showed only movement records without any check-in at all.
5. Another anomaly is that Pavilion No. 32 was closed for a while on Sunday.
Fig. 3-3 Sun 32 Check-in
6. IDs 103006, 313073, 657863, 1412235, 1937843 of the 11th group and IDs 521750, 644885, 1080969, 1600469, 1629516, 1781070, 1787551, 1935406 of the 550th group, both of Type 11, did not use any facilities on Friday. Therefore, Type 11 used the facilities significantly less than the other types.
Fig. 3-4 Force-directed layout for Fri
7. Similarly, IDs 1392457 and 1723510 of the 220th group of Type 11 did not use any facilities on Saturday.
Fig. 3-5 Force-directed layout for Saturday
8. Among the 1077 groups, IDs 521750, 644885, 1080969, 1600469, 1629516, 1781070, 1787551, 1935406 did not use any facilities.
Fig. 3-6 Force-directed layout for Sunday
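Two of the checks used in the list above can be sketched in a few lines. The text does not give the similarity formula used for facility No. 63, so the Pearson correlation coefficient is assumed here as one plausible reading (for a simple linear regression, its square is the coefficient of determination). The record types "check-in" and "movement", the function names, and all toy numbers are likewise assumptions for illustration.

```python
from math import sqrt


def ids_without_checkin(records):
    """Return the sorted ids that have movement records but no check-in.

    records: iterable of (person_id, record_type), where record_type is
    assumed to be either "check-in" or "movement".
    """
    moved, checked_in = set(), set()
    for person, record_type in records:
        if record_type == "check-in":
            checked_in.add(person)
        else:
            moved.add(person)
    return sorted(moved - checked_in)


def pearson(xs, ys):
    """Pearson correlation of two equally long sequences of counts."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat series carries no shape information
    return cov / sqrt(var_x * var_y)


def day_similarities(counts_by_day):
    """Pairwise similarity of per-interval check-in counts, keyed by day pair."""
    days = sorted(counts_by_day)
    return {
        (a, b): pearson(counts_by_day[a], counts_by_day[b])
        for i, a in enumerate(days)
        for b in days[i + 1:]
    }


# Toy data: person 3 only moves; day "C" has a differently shaped curve.
toy_records = [(1, "check-in"), (1, "movement"), (2, "check-in"), (3, "movement")]
toy_suspects = ids_without_checkin(toy_records)

toy_counts = {"A": [1, 2, 3, 4], "B": [2, 4, 6, 8], "C": [4, 3, 2, 1]}
toy_similarity = day_similarities(toy_counts)
```

Splitting each day into morning and afternoon series before comparing them would localize a drop in similarity to part of a day.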